Tao Jin

Andrew

ROGLE: Robust Global-Local Alignment with Automated Region Supervision for Text-Based Person Search

Jun 01, 2026 (via arXiv)

DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark

May 28, 2026 (via arXiv)

From Facts to Insights: A Persona-Driven Dual Memory Framework and Dataset for Role-Playing Agents

May 25, 2026 (via arXiv)

Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search

Apr 25, 2026 (via arXiv)

Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning

Apr 15, 2026 (via arXiv)

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Apr 12, 2026 (via arXiv)

From Perception to Planning: Evolving Ego-Centric Task-Oriented Spatiotemporal Reasoning via Curriculum Learning

Apr 12, 2026 (via arXiv)

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks

Apr 09, 2026 (via arXiv)

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026 (via arXiv)

Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding

Apr 02, 2026 (via arXiv)